Results 1 - 20 of 48
1.
Metabolomics ; 20(1): 15, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38267595

ABSTRACT

INTRODUCTION: Lipids are key compounds in the study of metabolism and are increasingly studied in biology projects. It is a very broad family that encompasses many compounds, and the name of the same compound may vary depending on the community where it is studied. OBJECTIVES: In addition, their structures are varied and complex, which complicates their analysis. Indeed, the structural resolution does not always allow a complete level of annotation, so the actual compound analysed will vary from study to study and should be clearly stated. For all these reasons, the identification and naming of lipids is complicated and very variable from one study to another, and needs to be harmonized. METHODS & RESULTS: In this position paper we will present and discuss the different ways to name lipids (with chemoinformatic and semantic identifiers) and their importance for sharing lipidomic results. CONCLUSION: Homogenising this identification and adopting the same rules is essential to be able to share data within the community and to map data on functional networks.


Subjects
Lipidomics , Metabolomics , Lipids
2.
Nucleic Acids Res ; 52(D1): D817-D821, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37897348

ABSTRACT

ViralZone (http://viralzone.expasy.org) is a knowledge repository for viruses that links biological knowledge and databases. It contains data on virion structure, genome, proteome, replication cycle and host-virus interactions. The new update provides better access to the data through contextual popups and higher resolution images in Scalable Vector Graphics (SVG) format. These images are designed to be dynamic and interactive with human viruses to give users better access to the data. In addition, a new coronavirus-specific resource provides regularly updated data on variants and molecular biology of SARS-CoV-2. Other virus-specific resources have been added to the database, particularly for HIV, herpesviruses and poxviruses.


Subjects
Knowledge Bases , Viruses , Humans , Virion/chemistry , Virion/genetics , Virion/growth & development , Viruses/chemistry , Viruses/genetics , Viruses/growth & development
3.
Nat Methods ; 20(2): 193-204, 2023 02.
Article in English | MEDLINE | ID: mdl-36543939

ABSTRACT

Progress in mass spectrometry lipidomics has led to a rapid proliferation of studies across biology and biomedicine. These generate extremely large raw datasets requiring sophisticated solutions to support automated data processing. To address this, numerous software tools have been developed and tailored for specific tasks. However, for researchers, deciding which approach best suits their application relies on ad hoc testing, which is inefficient and time consuming. Here we first review the data processing pipeline, summarizing the scope of available tools. Next, to support researchers, LIPID MAPS provides an interactive online portal listing open-access tools with a graphical user interface. This guides users towards appropriate solutions within major areas in data processing, including (1) lipid-oriented databases, (2) mass spectrometry data repositories, (3) analysis of targeted lipidomics datasets, (4) lipid identification and (5) quantification from untargeted lipidomics datasets, (6) statistical analysis and visualization, and (7) data integration solutions. Detailed descriptions of functions and requirements are provided to guide customized data analysis workflows.


Subjects
Computational Biology , Lipidomics , Computational Biology/methods , Software , Informatics , Lipids/chemistry
4.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36484697

ABSTRACT

MOTIVATION: To provide high quality, computationally tractable annotation of binding sites for biologically relevant (cognate) ligands in UniProtKB using the chemical ontology ChEBI (Chemical Entities of Biological Interest), to better support efforts to study and predict functionally relevant interactions between protein sequences and structures and small molecule ligands. RESULTS: We structured the data model for cognate ligand binding site annotations in UniProtKB and performed a complete reannotation of all cognate ligand binding sites using stable unique identifiers from ChEBI, which we now use as the reference vocabulary for all such annotations. We developed improved search and query facilities for cognate ligands in the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that ChEBI provides. AVAILABILITY AND IMPLEMENTATION: Binding site annotations for cognate ligands described using ChEBI are available for UniProtKB protein sequence records in several formats (text, XML and RDF) and are freely available to query and download through the UniProt website (www.uniprot.org), REST API (www.uniprot.org/help/api), SPARQL endpoint (sparql.uniprot.org/) and FTP site (https://ftp.uniprot.org/pub/databases/uniprot/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Knowledge Bases , Protein Databases , Ligands , Amino Acid Sequence , Binding Sites , Molecular Sequence Annotation
5.
Nucleic Acids Res ; 51(D1): D418-D427, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350672

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.


Subjects
Protein Databases , Humans , Amino Acid Sequence , Artificial Intelligence , Internet , Proteins/chemistry , Software
6.
Database (Oxford) ; 2022, 2022 08 12.
Article in English | MEDLINE | ID: mdl-35961013

ABSTRACT

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.


Subjects
Genomics , Proteins , Base Sequence , Computational Biology , Genome , Molecular Sequence Annotation
7.
Database (Oxford) ; 2022, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35411389

ABSTRACT

SwissBioPics (www.swissbiopics.org) is a freely available resource of interactive, high-resolution cell images designed for the visualization of subcellular location data. SwissBioPics provides images describing cell types from all kingdoms of life, from the specialized muscle, neuronal and epithelial cells of animals, to the rods, cocci, clubs and spirals of prokaryotes. All cell images in SwissBioPics are drawn in Scalable Vector Graphics (SVG), with each subcellular location tagged with a unique identifier from the controlled vocabulary of subcellular locations and organelles of UniProt (https://www.uniprot.org/locations/). Users can search and explore SwissBioPics cell images through our website, which provides a platform for users to learn more about how cells are organized. A web component allows developers to embed SwissBioPics images in their own websites, using the associated JavaScript and a styling template, and to highlight subcellular locations and organelles by simply providing the web component with the appropriate identifier(s) from the UniProt-controlled vocabulary or the 'Cellular Component' branch of the Gene Ontology (www.geneontology.org), as well as an organism identifier from the National Center for Biotechnology Information taxonomy (https://www.ncbi.nlm.nih.gov/taxonomy). The UniProt website now uses SwissBioPics to visualize the subcellular locations and organelles where proteins function. SwissBioPics is freely available for anyone to use under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. DATABASE URL: www.swissbiopics.org.


Subjects
Proteins , Controlled Vocabulary , Animals
8.
Nucleic Acids Res ; 50(D1): D693-D700, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34755880

ABSTRACT

Rhea (https://www.rhea-db.org) is an expert-curated knowledgebase of biochemical reactions based on the chemical ontology ChEBI (Chemical Entities of Biological Interest) (https://www.ebi.ac.uk/chebi). In this paper, we describe a number of key developments in Rhea since our last report in the database issue of Nucleic Acids Research in 2019. These include improved reaction coverage in Rhea, the adoption of Rhea as the reference vocabulary for enzyme annotation in the UniProt knowledgebase UniProtKB (https://www.uniprot.org), the development of a new Rhea website, and the designation of Rhea as an ELIXIR Core Data Resource. We hope that these and other developments will enhance the utility of Rhea as a reference resource to study and engineer enzymes and the metabolic systems in which they function.


Subjects
Chemical Phenomena , Factual Databases , Software , Animals , Humans , Internet , Knowledge Bases
9.
Metabolomics ; 17(6): 55, 2021 06 06.
Article in English | MEDLINE | ID: mdl-34091802

ABSTRACT

BACKGROUND: Improvements in mass spectrometry (MS) technologies coupled with bioinformatics developments have allowed considerable advancement in the measurement and interpretation of lipidomics data in recent years. Since research areas employing lipidomics are rapidly increasing, there is a great need for bioinformatic tools that capture and utilize the complexity of the data. Currently, the diversity and complexity within the lipidome are often concealed by summing over or averaging individual lipids up to (sub)class-based descriptors, losing valuable information about biological function and interactions with other distinct lipid molecules, proteins and/or metabolites. AIM OF REVIEW: To address this gap in knowledge, novel bioinformatics methods are needed to improve identification, quantification, integration and interpretation of lipidomics data. The purpose of this mini-review is to summarize exemplary methods to explore the complexity of the lipidome. KEY SCIENTIFIC CONCEPTS OF REVIEW: Here we describe six approaches that capture three core focus areas for lipidomics: (1) lipidome annotation including a resolvable database identifier, (2) interpretation via pathway- and enrichment-based methods, and (3) understanding complex interactions to emphasize specific steps in the analytical process and highlight challenges in analyses associated with the complexity of lipidome data.


Subjects
Computational Biology , Lipidomics , Factual Databases , Lipids , Mass Spectrometry
10.
Metabolites ; 11(1), 2021 Jan 12.
Article in English | MEDLINE | ID: mdl-33445429

ABSTRACT

The UniProt Knowledgebase UniProtKB is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support for both the study of natural products and for their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions, that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures and show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats for users to further mine or exploit as an annotation source, to enrich other natural product datasets and databases.

11.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subjects
Protein Databases , Proteins/chemistry , Amino Acid Sequence , COVID-19/metabolism , Internet , Molecular Sequence Annotation , Protein Domains , Protein Interaction Maps , SARS-CoV-2/metabolism , Sequence Alignment
12.
Nat Commun ; 11(1): 6144, 2020 12 01.
Article in English | MEDLINE | ID: mdl-33262342

ABSTRACT

The International Molecular Exchange (IMEx) Consortium provides scientists with a single body of experimentally verified protein interactions curated in rich contextual detail to an internationally agreed standard. In this update to the work of the IMEx Consortium, we discuss how this initiative has been working in practice, how it has ensured database sustainability, and how it is meeting emerging annotation challenges through the introduction of new interactor types and data formats. Additionally, we provide examples of how IMEx data are being used by biomedical researchers and integrated in other bioinformatic tools and resources.


Subjects
Access to Information , Genetic Databases , Humans , Information Dissemination , International Cooperation
14.
Gigascience ; 9(2), 2020 02 01.
Article in English | MEDLINE | ID: mdl-32034905

ABSTRACT

BACKGROUND: Genome and proteome annotation pipelines are generally custom built and not easily reusable by other groups. This leads to duplication of effort, increased costs, and suboptimal annotation quality. One way to address these issues is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation. RESULTS: Here we demonstrate one approach to generate portable genome and proteome annotation pipelines that users can run without recourse to custom software. This proof of concept uses our own rule-based annotation pipeline HAMAP, which provides functional annotation for protein sequences to the same depth and quality as UniProtKB/Swiss-Prot, and the World Wide Web Consortium (W3C) standards Resource Description Framework (RDF) and SPARQL (a recursive acronym for the SPARQL Protocol and RDF Query Language). We translate complex HAMAP rules into the W3C standard SPARQL 1.1 syntax, and then apply them to protein sequences in RDF format using freely available SPARQL engines. This approach supports the generation of annotation that is identical to that generated by our own in-house pipeline, using standard, off-the-shelf solutions, and is applicable to any genome or proteome annotation pipeline. CONCLUSIONS: HAMAP SPARQL rules are freely available for download from the HAMAP FTP site, ftp://ftp.expasy.org/databases/hamap/sparql/, under the CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license. A tutorial and supplementary code to use HAMAP as SPARQL are available on GitHub at https://github.com/sib-swiss/HAMAP-SPARQL, and general documentation about HAMAP can be found on the HAMAP website at https://hamap.expasy.org.


Subjects
Genomics/methods , Molecular Sequence Annotation/methods , DNA Sequence Analysis/methods , Protein Sequence Analysis/methods , Software/standards , Animals , Genomics/standards , Humans , Molecular Sequence Annotation/standards , DNA Sequence Analysis/standards , Protein Sequence Analysis/standards
15.
Bioinformatics ; 36(6): 1896-1901, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31688925

ABSTRACT

MOTIVATION: To provide high quality computationally tractable enzyme annotation in UniProtKB using Rhea, a comprehensive expert-curated knowledgebase of biochemical reactions which describes reaction participants using the ChEBI (Chemical Entities of Biological Interest) ontology. RESULTS: We replaced existing textual descriptions of biochemical reactions in UniProtKB with their equivalents from Rhea, which is now the standard for annotation of enzymatic reactions in UniProtKB. We developed improved search and query facilities for the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that Rhea and ChEBI provide. AVAILABILITY AND IMPLEMENTATION: UniProtKB at https://www.uniprot.org; UniProt REST API at https://www.uniprot.org/help/api; UniProt SPARQL endpoint at https://sparql.uniprot.org/; Rhea at https://www.rhea-db.org.


Subjects
Rheiformes , Animals , Protein Databases , Knowledge Bases
16.
F1000Res ; 8, 2019.
Article in English | MEDLINE | ID: mdl-31824649

ABSTRACT

Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-quality tools and resources to support the identification, analysis and functional characterisation of IDPs. The roadmap is the result of a workshop titled "An intrinsically disordered protein user community proposal for ELIXIR" held at the University of Padua. The workshop, and further consultation with the members of the wider IDP community, identified the key priority areas for the roadmap including the development of standards for data annotation, storage and dissemination; integration of IDP data into the ELIXIR Core Data Resources; and the creation of benchmarking criteria for IDP-related software. Here, we discuss these areas of priority, how they can be implemented in cooperation with the ELIXIR platforms, and their connections to existing ELIXIR Communities and international consortia. The article provides a preliminary blueprint for an IDP Community in ELIXIR and is an appeal to identify and involve new stakeholders.


Subjects
Intrinsically Disordered Proteins/metabolism
18.
Metabolites ; 9(5), 2019 Apr 30.
Article in English | MEDLINE | ID: mdl-31052310

ABSTRACT

Steroidomics studies face the challenge of separating analytical compounds with very similar structures (i.e., isomers). Liquid chromatography (LC) is commonly used to this end, but the shared core structure of this family of compounds compromises effective separations among the numerous chemical analytes with comparable physico-chemical properties. Careful tuning of the mobile phase gradient and an appropriate choice of the stationary phase can be used to overcome this problem, in turn modifying the retention times in different ways for each compound. In the usual workflow, this approach is suboptimal for the annotation of features based on retention times since it requires characterizing a library of known compounds for every fine-tuned configuration. We introduce a software solution, DynaStI, that is capable of annotating liquid chromatography-mass spectrometry (LC-MS) features by dynamically generating the retention times from a database containing intrinsic properties of a library of metabolites. DynaStI uses the well-established linear solvent strength (LSS) model for reversed-phase LC. Given a list of LC-MS features and some characteristics of the LC setup, this software computes the corresponding retention times for the internal database and then annotates the features using the exact masses with predicted retention times at the working conditions. DynaStI (https://dynasti.vital-it.ch) is able to automatically calibrate its predictions to compensate for deviations in the input parameters. The database also includes identification and structural information for each annotation, such as IUPAC name, CAS number, SMILES string, metabolic pathways, and links to external metabolomic or lipidomic databases.
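
The retention-time prediction described in this abstract can be illustrated with a minimal sketch of the textbook LSS relations, log10(k) = log10(k_w) - S·φ and, for a linear gradient, t_R = (t_0/b)·log10(2.3·k_0·b + 1) + t_0 + t_D with b = t_0·Δφ·S/t_G. This is not DynaStI's code: the function names, the library layout, the tolerances and the numeric values are all assumptions made for illustration, and the gradient formula is only valid for compounds that elute before the gradient ends.

```python
import math


def lss_retention_factor(log_kw, S, phi):
    """LSS model: log10(k) = log10(k_w) - S * phi (phi = organic fraction, 0..1)."""
    return 10 ** (log_kw - S * phi)


def gradient_retention_time(log_kw, S, t0, tD, tG, phi_start, phi_end):
    """Retention time under a linear gradient (same time unit as t0).

    t0: column dead time, tD: system dwell time, tG: gradient duration.
    Assumes phi_end != phi_start and elution within the gradient window.
    """
    k0 = lss_retention_factor(log_kw, S, phi_start)   # retention factor at gradient start
    b = t0 * (phi_end - phi_start) * S / tG           # gradient steepness parameter
    return (t0 / b) * math.log10(2.3 * k0 * b + 1) + t0 + tD


def annotate(features, library, lc_setup, ppm=10.0, rt_tol=0.2):
    """Match (m/z, rt) features to library entries by exact mass and predicted rt.

    library maps a name to (exact_mz, log_kw, S); this layout is hypothetical.
    """
    hits = []
    for mz, rt in features:
        for name, (exact_mz, log_kw, S) in library.items():
            predicted = gradient_retention_time(log_kw, S, **lc_setup)
            mass_error_ppm = abs(mz - exact_mz) / exact_mz * 1e6
            if mass_error_ppm <= ppm and abs(rt - predicted) <= rt_tol:
                hits.append((mz, rt, name))
    return hits


# Illustrative values only; they are not taken from the DynaStI database.
lc_setup = dict(t0=1.0, tD=0.0, tG=10.0, phi_start=0.0, phi_end=1.0)
library = {"hypothetical_steroid": (289.2162, 2.0, 4.0)}
print(annotate([(289.2160, 5.9)], library, lc_setup))
```

Because the intrinsic parameters (log k_w and S) are stored per compound, changing the gradient only means calling `gradient_retention_time` with a different `lc_setup`; no new reference library has to be measured, which is the point the abstract makes.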

19.
Nucleic Acids Res ; 47(D1): D596-D600, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30272209

ABSTRACT

Rhea (http://www.rhea-db.org) is a comprehensive and non-redundant resource of over 11 000 expert-curated biochemical reactions that uses chemical entities from the ChEBI ontology to represent reaction participants. Originally designed as an annotation vocabulary for the UniProt Knowledgebase (UniProtKB), Rhea also provides reaction data for a range of other core knowledgebases and data repositories including ChEBI and MetaboLights. Here we describe recent developments in Rhea, focusing on a new Resource Description Framework (RDF) representation of Rhea reaction data and a SPARQL endpoint (https://sparql.rhea-db.org/sparql) that provides access to it. We demonstrate how federated queries that combine the Rhea SPARQL endpoint and other SPARQL endpoints such as that of UniProt can provide improved metabolite annotation and support integrative analyses that link the metabolome through the proteome to the transcriptome and genome. These developments will significantly boost the utility of Rhea as a means to link chemistry and biology for a more holistic understanding of biological systems and their function in health and disease.


Subjects
Chemical Databases , Protein Databases , Metabolomics/methods , Software/standards , Humans , Knowledge Bases , Systems Biology/methods
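
As a rough illustration of how the Rhea SPARQL endpoint named in the abstract above might be queried, the sketch below builds a query for reactions involving a given ChEBI compound and encodes it as a GET URL following the SPARQL 1.1 Protocol. Only the endpoint URL comes from the abstract; the `rh:` prefix, the property path `rh:side/rh:contains/rh:compound/rh:chebi` and the helper names are assumptions based on Rhea's published RDF examples and should be checked against the current Rhea documentation. No network request is made.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the abstract.
RHEA_SPARQL_ENDPOINT = "https://sparql.rhea-db.org/sparql"


def build_rhea_query(chebi_id, limit=10):
    """Return a SPARQL query for Rhea reactions with a given ChEBI participant.

    The vocabulary terms used here are assumed, not confirmed by the abstract.
    """
    chebi_id = int(chebi_id)   # rejects anything that is not a plain number
    limit = int(limit)
    return f"""PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>
SELECT ?reaction ?equation WHERE {{
  ?reaction rdfs:subClassOf rh:Reaction ;
            rh:equation ?equation ;
            rh:side/rh:contains/rh:compound/rh:chebi CHEBI:{chebi_id} .
}}
LIMIT {limit}
"""


def build_request_url(query, endpoint=RHEA_SPARQL_ENDPOINT):
    """Encode the query as a SPARQL 1.1 Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


# CHEBI:15377 is water; any ChEBI identifier could be substituted.
print(build_request_url(build_rhea_query(15377, limit=5)))
```

A federated query of the kind the abstract mentions would add a standard SPARQL 1.1 `SERVICE` clause pointing at a second endpoint, such as UniProt's, inside the `WHERE` block.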
20.
Nucleic Acids Res ; 47(D1): D351-D360, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30398656

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.


Subjects
Protein Databases , Molecular Sequence Annotation , Animals , Genetic Databases , Gene Ontology , Humans , Internet , Multigene Family , Protein Domains/genetics , Amino Acid Sequence Homology , Software , User-Computer Interface